AITopics | soft mask

soft mask

PDP: Parameter-free Differentiable Pruning is All You Need

Neural Information Processing Systems, Oct-9-2025, 01:21:49 GMT

Hence, a desirable pruning algorithm should achieve high accuracy and accelerate inference for various types of networks without significant training overheads in costs and complexity.

machine learning, natural language, pruning, (18 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Enhancing Physical Consistency in Lightweight World Models

Wang, Dingrui, Sun, Zhexiao, Li, Zhouheng, Wang, Cheng, Peng, Youlun, Ye, Hongyuan, Zarrouki, Baha, Li, Wei, Piccinini, Mattia, Xie, Lei, Betz, Johannes

arXiv.org Artificial Intelligence, Sep-17-2025

A major challenge in deploying world models is the trade-off between size and performance. Large world models can capture rich physical dynamics but require massive computing resources, making them impractical for edge devices. Small world models are easier to deploy but often struggle to learn accurate physics, leading to poor predictions. We propose the Physics-Informed BEV World Model (PIWM), a compact model designed to efficiently capture physical interactions in bird's-eye-view (BEV) representations. PIWM uses Soft Mask during training to improve dynamic object modeling and future prediction. We also introduce a simple yet effective technique, Warm Start, for inference to enhance prediction quality with a zero-shot model. Experiments show that at the same parameter scale (400M), PIWM surpasses the baseline by 60.6% in weighted overall score. Moreover, even when compared with the largest baseline model (400M), the smallest PIWM (130M Soft Mask) achieves a 7.4% higher weighted overall score with a 28% faster inference speed.

artificial intelligence, soft mask, world model, (14 more...)

arXiv.org Artificial Intelligence

2509.12437

Country:

Asia > China (0.46)
Europe (0.46)

Genre: Research Report (0.82)

Industry: Transportation (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Masked Subspace Clustering Methods

Song, Jiebo, Ling, Huaming

arXiv.org Artificial Intelligence, May-13-2025

To further utilize the unsupervised features and pairwise information, we propose a general Bilevel Clustering Optimization (BCO) framework to improve clustering performance. We then introduce three special cases on subspace clustering with two different types of masks. First, we reformulate the original subspace clustering as Basic Masked Subspace Clustering (BMSC), which recasts the diagonal constraints as a hard mask. Then, we provide a General Masked Subspace Clustering (GMSC) method to integrate different clusterings via a soft mask. Furthermore, based on BCO and GMSC, we introduce a learnable soft mask and design a Recursive Masked Subspace Clustering (RMSC) method that alternately updates the affinity matrix and the soft mask. Numerical experiments show that our models obtain significant improvements over the baselines on several commonly used datasets, such as MNIST, USPS, ORL, COIL20 and COIL100.
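
The difference between the hard and soft masks mentioned above can be illustrated with a small sketch. It assumes the usual self-expressive setting in which a coefficient matrix C must have a zero diagonal; the function names `hard_diagonal_mask` and `apply_mask` are hypothetical and are not taken from the paper.

```python
def hard_diagonal_mask(n):
    """Hard (0/1) mask that zeroes the diagonal of an n x n matrix."""
    return [[0.0 if i == j else 1.0 for j in range(n)] for i in range(n)]


def apply_mask(coeffs, mask):
    """Element-wise (Hadamard) product of a coefficient matrix and a mask."""
    return [[c * m for c, m in zip(c_row, m_row)]
            for c_row, m_row in zip(coeffs, mask)]


# Assumed toy coefficient matrix, for illustration only.
C = [[0.5, 0.2],
     [0.3, 0.7]]

# Hard mask: enforces diag(C) = 0 without an explicit constraint.
C_hard = apply_mask(C, hard_diagonal_mask(2))

# Soft mask: entries in [0, 1] can down-weight pairs instead of removing them.
soft_mask = [[0.0, 0.5],
             [1.0, 0.0]]
C_soft = apply_mask(C, soft_mask)
```

Under this reading, a hard mask is just the special case of a soft mask whose entries are restricted to 0 and 1.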

affinity matrix, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2505.06863

Country: North America > United States (0.35)

Genre: Research Report (0.64)

Industry: Government > Regional Government > North America Government > United States Government (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

PDP: Parameter-free Differentiable Pruning is All You Need

Cho, Minsik, Adya, Saurabh, Naik, Devang

arXiv.org Artificial Intelligence, Nov-17-2023

DNN pruning is a popular way to reduce the size of a model, improve the inference latency, and minimize the power consumption on DNN accelerators. However, existing approaches might be too complex, expensive or ineffective to apply to a variety of vision/language tasks, DNN architectures and to honor structured pruning constraints. In this paper, we propose an efficient yet effective train-time pruning scheme, Parameter-free Differentiable Pruning (PDP), which offers state-of-the-art qualities in model size, accuracy, and training cost. PDP uses a dynamic function of weights during training to generate soft pruning masks for the weights in a parameter-free manner for a given pruning target. While differentiable, the simplicity and efficiency of PDP make it universal enough to deliver state-of-the-art random/structured/channel pruning results on various vision and natural language tasks. For example, for MobileNet-v1, PDP can achieve 68.2% top-1 ImageNet1k accuracy at 86.6% sparsity, which is 1.7% higher accuracy than those from the state-of-the-art algorithms. Also, PDP yields over 83.1% accuracy on Multi-Genre Natural Language Inference with 90% sparsity for BERT, while the next best from the existing techniques shows 81.5% accuracy. In addition, PDP can be applied to structured pruning, such as N:M pruning and channel pruning. For 1:4 structured pruning of ResNet18, PDP improved the top-1 ImageNet1k accuracy by over 3.6% over the state-of-the-art. For channel pruning of ResNet50, PDP reduced the top-1 ImageNet1k accuracy by 0.6% from the state-of-the-art.
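
The abstract does not spell out the "dynamic function of weights" that produces the soft masks. As a minimal sketch only, assume the mask is a sigmoid of the gap between each squared weight and a squared threshold, with the threshold recomputed from the current weights so that the target sparsity is met. The names `soft_prune_mask` and `tau` are assumptions for illustration, not the paper's API.

```python
import math


def _sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def soft_prune_mask(weights, sparsity, tau=1e-2):
    """Soft masks in [0, 1] for a flat list of weights.

    The threshold is derived from the weights themselves, so the mask
    has no trainable parameters. `tau` is an assumed temperature that
    controls how sharp the mask is.
    """
    n = len(weights)
    k = int(round(sparsity * n))  # number of weights to prune
    if k <= 0:
        return [1.0] * n
    if k >= n:
        return [0.0] * n
    mags = sorted(abs(w) for w in weights)
    # Threshold halfway between the largest pruned and smallest kept magnitude.
    t = 0.5 * (mags[k - 1] + mags[k])
    return [_sigmoid((w * w - t * t) / tau) for w in weights]


weights = [0.9, -0.05, 0.4, 0.01]
masks = soft_prune_mask(weights, sparsity=0.5)
masked_weights = [w * m for w, m in zip(weights, masks)]
```

Because the mask is a smooth function of the weights, gradients can flow through it during training; rounding it at 0.5 afterwards would give a binary mask at the requested sparsity.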

accuracy, pdp, pruning, (14 more...)

arXiv.org Artificial Intelligence

2305.11203

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

xAI-CycleGAN, a Cycle-Consistent Generative Assistive Network

Sloboda, Tibor, Hudec, Lukáš, Benešová, Wanda

arXiv.org Artificial Intelligence, Jun-27-2023

In the domain of unsupervised image-to-image transformation using generative transformative models, CycleGAN has become the architecture of choice. One of the primary downsides of this architecture is its relatively slow rate of convergence. In this work, we use discriminator-driven explainability to speed up the convergence of the generative model: saliency maps from the discriminator mask the gradients of the generator during backpropagation, based on the work of Nagisetty et al. We also introduce the saliency map on the input, added onto a Gaussian noise mask, by using an interpretable latent variable based on Wang M.'s Mask CycleGAN. This allows for explainability fusion in both directions and uses the noise-added saliency map on the input as evidence-based counterfactual filtering. The new architecture has a much higher rate of convergence than a baseline CycleGAN architecture while preserving image quality.

cyclegan, discriminator, generator, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-031-44137-0_33

2306.1576

Country: Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

All-in-One: A Highly Representative DNN Pruning Framework for Edge Devices with Dynamic Power Management

Gong, Yifan, Zhan, Zheng, Zhao, Pu, Wu, Yushu, Wu, Chao, Ding, Caiwen, Jiang, Weiwen, Qin, Minghai, Wang, Yanzhi

arXiv.org Artificial Intelligence, Dec-9-2022

During the deployment of deep neural networks (DNNs) on edge devices, much research effort is devoted to coping with limited hardware resources. However, little attention is paid to the influence of dynamic power management. As edge devices typically have only a limited energy budget from batteries (rather than the almost unlimited energy supply of servers or workstations), their dynamic power management often changes the execution frequency, as in the widely used dynamic voltage and frequency scaling (DVFS) technique. This leads to highly unstable inference speed, especially for computation-intensive DNN models, which can harm user experience and waste hardware resources. We first identify this problem and then propose All-in-One, a highly representative pruning framework to work with dynamic power management using DVFS. The framework can use only one set of model weights and soft masks (together with other auxiliary parameters of negligible storage) to represent multiple models of various pruning ratios. By re-configuring the model to the corresponding pruning ratio for a specific execution frequency (and voltage), we are able to achieve stable inference speed, i.e., keeping the difference in speed under various execution frequencies as small as possible. Our experiments demonstrate that our method not only achieves high accuracy for multiple models of different pruning ratios, but also reduces their variance of inference latency across frequencies, with the minimal memory consumption of only one model and one soft mask.
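
One way a single soft mask could represent several pruning ratios is to threshold it at different levels. The sketch below assumes exactly that: the soft mask holds one importance score per unit, and the lowest-scoring fraction is pruned for a given ratio. The function name `binary_mask_for_ratio` and the scores are hypothetical; the abstract does not describe the actual reconfiguration procedure.

```python
def binary_mask_for_ratio(soft_mask, prune_ratio):
    """Derive a 0/1 mask that prunes the lowest-scoring fraction of units."""
    n = len(soft_mask)
    k = int(round(prune_ratio * n))  # number of units to prune
    # Indices sorted by ascending soft-mask score; the first k are pruned.
    order = sorted(range(n), key=lambda i: soft_mask[i])
    pruned = set(order[:k])
    return [0 if i in pruned else 1 for i in range(n)]


# One shared soft mask (assumed scores), reused for every pruning ratio.
soft_mask = [0.9, 0.1, 0.6, 0.3]

mask_25 = binary_mask_for_ratio(soft_mask, 0.25)
mask_50 = binary_mask_for_ratio(soft_mask, 0.50)
mask_75 = binary_mask_for_ratio(soft_mask, 0.75)
```

With this construction the masks are nested: every unit pruned at a lower ratio stays pruned at a higher one, so only the soft mask and the weights need to be stored.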

artificial intelligence, frequency, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2212.05122

Country:

North America > United States > California > San Diego County > San Diego (0.05)
North America > United States > Connecticut (0.04)

Genre: Research Report (0.64)

Industry: Electrical Industrial Apparatus (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

CWP: Instance complexity weighted channel-wise soft masks for network pruning

Wang, Jiapeng, Ma, Ming, Yu, Zhenhua

arXiv.org Artificial Intelligence, Oct-11-2022

Existing differentiable channel pruning methods often attach scaling factors or masks behind channels to prune filters with less importance, and implicitly assume a uniform contribution of input samples to filter importance. Specifically, the effects of instance complexity on pruning performance have not yet been fully investigated in static network pruning. In this paper, we propose a simple yet effective differentiable network pruning method, CWP, based on instance complexity weighted filter importance scores. We define an instance-complexity-related weight for each instance by giving higher weights to hard instances, and measure the weighted sum of instance-specific soft masks to model the non-uniform contribution of different inputs, which encourages hard instances to dominate the pruning process and the model performance to be well preserved. In addition, we introduce a regularizer to maximize polarization of the masks, such that a sweet spot can be easily found to identify the filters to be pruned. Performance evaluations on various network architectures and datasets demonstrate that CWP has advantages over state-of-the-art methods in pruning large networks. For instance, CWP improves the accuracy of ResNet56 on the CIFAR-10 dataset by 0.32% after removing 64.11% of FLOPs, and prunes 87.75% of the FLOPs of ResNet50 on the ImageNet dataset with only 0.93% Top-1 accuracy loss.
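
The weighted sum of instance-specific soft masks can be sketched as follows. The abstract does not say how instance complexity is measured; this example assumes, purely for illustration, that the per-instance loss is used and normalized so the weights sum to 1. The function name `weighted_channel_mask` is hypothetical.

```python
def weighted_channel_mask(instance_masks, instance_losses):
    """Combine per-instance channel masks, weighting hard instances more.

    instance_masks:  list of per-instance soft masks, one value per channel.
    instance_losses: assumed proxy for instance complexity (higher = harder).
    """
    total = sum(instance_losses)
    if total == 0:
        # Fall back to uniform weights if no instance is harder than another.
        weights = [1.0 / len(instance_losses)] * len(instance_losses)
    else:
        weights = [loss / total for loss in instance_losses]
    n_channels = len(instance_masks[0])
    return [sum(w * mask[c] for w, mask in zip(weights, instance_masks))
            for c in range(n_channels)]


# Two instances, two channels: the harder instance (loss 3.0) dominates.
combined = weighted_channel_mask(
    instance_masks=[[1.0, 0.0], [0.0, 1.0]],
    instance_losses=[3.0, 1.0],
)
```

Here the channel favored by the harder instance ends up with the larger combined score, which is the non-uniform contribution the abstract describes.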

artificial intelligence, machine learning, soft mask, (15 more...)

arXiv.org Artificial Intelligence

2209.03534

Country:

Asia > China > Ningxia Hui Autonomous Region > Yinchuan (0.04)
Africa > Central African Republic > Ombella-M'Poko > Bimbo (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Communications > Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

KSM: Fast Multiple Task Adaption via Kernel-wise Soft Mask Learning

Yang, Li, He, Zhezhi, Zhang, Junshan, Fan, Deliang

arXiv.org Artificial Intelligence, Sep-11-2020

Deep Neural Networks (DNNs) can forget the knowledge about earlier tasks when learning new tasks, which is known as catastrophic forgetting. While recent continual learning methods are capable of alleviating the catastrophic forgetting problem on toy-sized datasets, some issues remain to be tackled when applying them to real-world problems. Recently, fast mask-based learning methods (e.g., Piggyback (Mallya et al., 2018)) have been proposed to address these issues by learning only a binary element-wise mask in a fast manner, while keeping the backbone model fixed. However, the binary mask has limited modeling capacity for new tasks. A more recent work (Hung et al., 2019) proposes a compress-grow-based method (CPG) to achieve better accuracy for new tasks by partially training the backbone model, but with an order-higher training cost, which makes it infeasible to deploy in popular state-of-the-art edge-/mobile-learning. The primary goal of this work is to simultaneously achieve fast and high-accuracy multi-task adaption in the continual learning setting. Thus motivated, we propose a new training method called kernel-wise Soft Mask (KSM), which learns a kernel-wise hybrid binary and real-value soft mask for each task, while using the same backbone model. Such a soft mask can be viewed as a superposition of a binary mask and a properly scaled real-value tensor, which offers a richer representation capability without low-level kernel support to meet the objective of low hardware overhead. We validate KSM on multiple benchmark datasets against recent state-of-the-art methods (e.g., Piggyback, PackNet, CPG, etc.), which shows good improvement in both accuracy and training cost.
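
The "superposition of a binary mask and a properly scaled real-value tensor" can be made concrete with a small sketch. It assumes one mask value per kernel, computed as binary + scale * real, and applied by multiplying every element of that kernel. The names `ksm_soft_mask`, `apply_kernelwise`, and `scale` are hypothetical, and how the paper chooses the scaling is not stated in the abstract.

```python
def ksm_soft_mask(binary, real, scale):
    """One soft-mask value per kernel: binary part plus a scaled real part."""
    return [b + scale * r for b, r in zip(binary, real)]


def apply_kernelwise(kernels, mask):
    """Multiply every element of each 2-D kernel by that kernel's mask value."""
    return [[[m * v for v in row] for row in kernel]
            for kernel, m in zip(kernels, mask)]


# Frozen backbone kernels (assumed toy values) shared by all tasks.
kernels = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[1.0, 1.0], [1.0, 1.0]],
]

# Per-task mask components; only these would be learned for a new task.
binary = [1.0, 0.0]
real = [0.5, -0.2]
mask = ksm_soft_mask(binary, real, scale=0.1)

task_kernels = apply_kernelwise(kernels, mask)
```

In this reading, a kernel whose binary part is 0 is not fully discarded: the real-valued part still lets it contribute with a small weight, which is where the extra capacity over a purely binary mask would come from.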

artificial intelligence, backbone model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2009.05668

Country: North America > United States > Arizona > Maricopa County > Tempe (0.04)

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
